ABSTRACT

Suppose that we are given a set of n elements $e_1, \ldots, e_n$ which are to be arranged in some order. At each unit of time a request is made to retrieve one of these elements, $e_i$ being requested (independently of the past) with probability $P_i$, $P_i \ge 0$, $\sum_{i=1}^{n} P_i = 1$. The cost of the retrieval is taken to be the ordered position of the element requested.


The problem of interest is to determine the optimal ordering so as to minimize the long run average cost. Clearly if the $P_i$ were known the optimal ordering would simply be to order the elements in decreasing order of the $P_i$'s. In fact even if the $P_i$'s were unknown we could do as well asymptotically by ordering the elements at each unit of time in decreasing order of the number of previous requests for them. In this paper we first consider the case in which the only memory allowed at any time is the ordering of the elements at that time; in other words, the only reordering rules we allow are ones in which the reordered permutation of elements at any time depends only on the present ordering and the position of the element requested. We show that the rule which always moves the requested element one closer to the front of the line minimizes the average position of the element requested among a wide class of rules for all probability vectors of the form $P_1 = p$, $P_2 = \cdots = P_n = (1 - p)/(n - 1)$. In fact, we establish this under a stronger optimality condition, namely the criterion of stochastically minimizing the asymptotic position of the element requested.


We also consider the above problem under the premise that additional memory is allowed. In particular we allow the decision-maker to utilize such rules as "only make a change (according to some preassigned rule) if the same element has been requested k times in a row." We show that as k approaches infinity we can do as well as if we knew the values of the $P_i$, and in addition we show that the convergence is monotone.


We then allow for the possibility of randomization. We first consider policies which at every unit of time follow some given rule with probability $a$ and do nothing (make no reordering) with probability $1 - a$, and show that their average costs are independent of $a$. However if we allow the randomization constant to be a function of the position of the element requested (one instance would be a policy which, when the element selected is in position $i$, moves it to the front with probability $a_i$ and leaves the ordering unchanged with probability $1 - a_i$) then the average cost depends on the sequence of randomization constants. Interestingly enough this is not the case for the one-closer rule, whose average cost remains invariant under such randomization.
OPTIMAL LIST ORDER UNDER PARTIAL MEMORY CONSTRAINTS


by 

Y.  C.  Kan  and  S.  M.  Ross 


0.  INTRODUCTION  AND  SUMMARY 

Suppose that we are given a set of n elements $e_1, \ldots, e_n$ which are to be arranged in some order. At each unit of time a request is made to retrieve one of these elements, $e_i$ being requested (independently of the past) with probability $P_i$, $P_i \ge 0$, $\sum_{i=1}^{n} P_i = 1$. The cost of the retrieval is taken to be the ordered position of the element requested.

The problem of interest is to determine the optimal ordering so as to minimize the long run average cost. Clearly if the $P_i$ were known the optimal ordering would simply be to order the elements in decreasing order of the $P_i$'s. In fact even if the $P_i$'s were unknown we could do as well asymptotically by ordering the elements at each unit of time in decreasing order of the number of previous requests for them. However the problem becomes more interesting if we do not allow such memory storage as would be necessary for the above rule but rather restrict ourselves to a more limited memory storage. In [5] the case was considered where the only memory allowed at any time was the ordering of the elements at that time; in other words the only reordering rules allowed in [5] are ones in which the reordered permutation of elements at any time depends only on the present ordering and the position of the element requested. We call such rules no-memory rules. A no-memory rule was said to be optimal in [5] if its average cost as a function of the probability vector $P$ is minimal among all rules for every probability vector $P$ having $0 < P_i < 1$, $i = 1, \ldots, n$. Whereas it is not obvious that an optimal rule exists, it was conjectured in [5] that the rule which always moves the requested element one position closer to the front (called the transposition rule) is optimal. Though this conjecture was not proved, it was shown in [5] that the transposition rule always has a smaller average cost than the one which moves the requested element to the front of the line.*

In Section 1 of the paper we consider the above problem under the premise that additional memory is allowed. In particular we allow the decision-maker to utilize such rules as "only make a change (according to some preassigned rule) if the same element has been requested k times in a row." We show that as k approaches infinity we can do as well as if we knew the values of the $P_i$, and in addition we show that the convergence is monotone.

In Section 2 we allow for the possibility of randomization. We first consider policies which at every unit of time follow some given (nonmemory and nonrandomized) rule with probability $a$ and do nothing (make no reordering) with probability $1 - a$, and show that their average costs are independent of $a$. However if we allow the randomization constant to be a function of the position of the element requested (one instance would be a policy which, when the element selected is in position $i$, moves it to the front with probability $a_i$ and leaves the ordering unchanged with probability $1 - a_i$) then the average cost depends on the sequence of randomization constants. Interestingly enough this is not the case for the (conjectured optimal) transposition rule, whose average cost remains invariant under such randomization.


* If the present ordering is $e_1, e_2, e_3, e_4$ and element $e_3$ is requested then the transposition rule leads to the new ordering $e_1, e_3, e_2, e_4$ whereas the front of the line rule leads to $e_3, e_1, e_2, e_4$.
In the final section we consider the original model where the only rules allowed are ones whose reordering is based on the present ordering and the position of the element requested. We show that the transposition rule is optimal among a wide class of rules for all probability vectors of the form $P_1 = p$, $P_2 = \cdots = P_n = (1 - p)/(n - 1)$. In fact we establish this under a stronger optimality condition, namely the criterion of stochastically minimizing the asymptotic position of the element requested.




1.  K-IN-A-ROW  POLICIES 

Consider  any  rule  R  which  after  each  request  reorders  the  list 
solely  as  a  function  of  the  present  ordering  and  the  position  of  the  element 
requested  and  suppose  now  that  we  are  allowed  to  follow  the  policy  where 
we  only  make  a  change  in  the  list  order  (according  to  rule  R)  if  the  same 
element  has  been  requested  k  times  in  a  row.  (Such  policies  would  require 
two  additional  counters  of  memory  space  -  one  for  keeping  track  of  the 
last  element  requested  and  the  other  keeping  track  of  the  number  of  times 
in  a  row  it  had  been  requested.)  Once  an  element  has  been  requested  k 
times  in  a  row  we  reorder  the  list  according  to  R  and  then  start  over 
again  as  far  as  waiting  for  another  run  of  k  identical  requests. 
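The policy just described can be sketched in a few lines of Python. This is only an illustrative sketch: the names `transpose`, `move_to_front` and `KInARow` are hypothetical, positions inside the code are 0-based, and the two attributes `last` and `run` play the role of the two additional counters mentioned above.

```python
def transpose(order, pos):
    """Transposition rule: swap the element at index pos with its predecessor."""
    new = list(order)
    if pos > 0:
        new[pos - 1], new[pos] = new[pos], new[pos - 1]
    return new


def move_to_front(order, pos):
    """Front of the line rule: move the element at index pos to index 0."""
    new = list(order)
    new.insert(0, new.pop(pos))
    return new


class KInARow:
    """Apply a no-memory rule only after k consecutive requests for one element."""

    def __init__(self, order, rule, k):
        self.order = list(order)
        self.rule = rule
        self.k = k
        self.last = None  # counter 1: the last element requested
        self.run = 0      # counter 2: length of the current run

    def request(self, element):
        """Serve one request and return its cost (the 1-based position)."""
        pos = self.order.index(element)
        cost = pos + 1
        if element == self.last:
            self.run += 1
        else:
            self.last, self.run = element, 1
        if self.run == self.k:
            self.order = self.rule(self.order, pos)
            self.last, self.run = None, 0  # start over after a reordering
        return cost
```

With k = 1 the class reduces to the underlying no-memory rule, since every request is then a completed run.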

The  sequence  of  list  orderings  which  result  under  the  above  policy 
can  most  easily  be  analyzed  as  a  semi-Markov  process  with  the  state  at 
any  time  being  the  ordering  at  that  time  and  the  epochs  of  transition 
being  the  times  at  which  a  run  of  k  identical  requests  have  occurred. 

We start by computing the probability that any given run of k identical requests were all requests for element i.

Proposition 1.1:

Consider a sequence of independent multinomial trials, each resulting in outcome $i$ with probability $p_i$, $\sum_{i=1}^{n} p_i = 1$. Then the probability that a run of $k_1$ successive trials all resulting in outcome number 1 occurs before any run of $k_i$ successive $i$ outcomes, $i = 2, \ldots, n$, equals

$$\frac{p_1^{k_1}(1 - p_1)/(1 - p_1^{k_1})}{\sum_{j=1}^{n} p_j^{k_j}(1 - p_j)/(1 - p_j^{k_j})}.$$
Proof:

We first compute the expected number of coin tosses, call it $E[T]$, until a run of $N$ successive heads occurs when the tosses are independent and each lands on heads with probability $p$. By conditioning on the time of the first nonhead we obtain

$$E[T] = \sum_{j=1}^{N} (1 - p)\,p^{j-1}\,(j + E[T]) + N p^{N}.$$

Solving the above for $E[T]$ yields

$$E[T] = N + \frac{1 - p}{p^{N}} \sum_{j=1}^{N} j\,p^{j-1}$$

and, simplifying, we obtain

$$E[T] = \frac{1 + p + \cdots + p^{N-1}}{p^{N}} = \frac{1 - p^{N}}{p^{N}(1 - p)}.$$

Now consider the (infinite) sequence of multinomial trials as specified in the statement of the proposition. Let us say that an i-success occurs whenever we obtain a run of $k_i$ successive $i$ outcomes. Then by renewal theory the rate of i-successes is just 1 divided by the expected time between i-successes, and so the proportion of successes that are i-successes is (with probability 1)

$$\frac{1/E[T_i]}{\sum_{j=1}^{n} 1/E[T_j]} = \frac{p_i^{k_i}(1 - p_i)/(1 - p_i^{k_i})}{\sum_{j=1}^{n} p_j^{k_j}(1 - p_j)/(1 - p_j^{k_j})}$$

where $T_i$ is the time between i-successes.
But each time a success (that is, an i-success for any i) occurs everything restarts itself, and so the limiting proportion of successes that are of type $i$ must also equal the probability that an i-success occurs before any j-success, $j \ne i$. ||
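The expression in Proposition 1.1 is easy to evaluate numerically. The sketch below (the function name is an illustrative choice) returns, for every outcome i, the probability that its run is completed first; it assumes every $p_i$ lies strictly between 0 and 1.

```python
from fractions import Fraction


def run_first_probabilities(p, k):
    """Probability, for each outcome i, that a run of k[i] i's occurs first.

    p: outcome probabilities (each strictly between 0 and 1, summing to 1)
    k: required run length for each outcome
    """
    weights = [pi**ki * (1 - pi) / (1 - pi**ki) for pi, ki in zip(p, k)]
    total = sum(weights)
    return [w / total for w in weights]


half = Fraction(1, 2)
print(run_first_probabilities([half, half], [1, 2]))
```

In this two-outcome example a single outcome 1 must occur before two successive outcomes 2. Outcome 1 loses only if the first two trials are both 2, so the answer should be 1 - 1/4 = 3/4, which is what the formula returns.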


Now in a semi-Markov process if we let $\pi_i$ denote the limiting probability of being in state $i$ for the embedded Markov chain which looks at the process only when transitions occur, and we let $\mu_i$ denote the mean time until the next transition when in state $i$, then the limiting proportion of time the state is $i$ equals $\pi_i \mu_i / \sum_j \pi_j \mu_j$. In our problem the mean time spent in any state is constant (it is just the expected time to obtain a run of k requests for the same element). Hence it follows that the limiting proportion of time spent in each state is equal to the limiting probabilities for the embedded Markov chain which only considers the successive orderings when transitions (i.e., runs of k in a row) occur.

Thus it follows that the performance of the policy which uses rule R only when there have been k requests in a row for the same element is exactly the same as the performance of rule R in the case where the request probabilities are no longer $P_1, \ldots, P_n$ but rather are now given by $P_1^{(k)}, \ldots, P_n^{(k)}$ where

$$P_i^{(k)} = \frac{P_i^{k}(1 - P_i)/(1 - P_i^{k})}{\sum_{j=1}^{n} P_j^{k}(1 - P_j)/(1 - P_j^{k})}.$$

The  next  lemma  shows  that  as  k  increases  the  proportion  of  requests 
(in  the  embedded  chain)  for  the  element  having  the  largest  request  probability 
increases  to  1;  among  the  remaining  requests  the  proportion  of  those  that  are 
Proof:

The derivative of $(a^x - 1)/(b^x - 1)$ will be negative if

$$\frac{a^x \ln a}{a^x - 1} < \frac{b^x \ln b}{b^x - 1},$$

but this follows from the fact that $\frac{b^x \ln b}{b^x - 1}$ is an increasing function of $b$ when $b > 1$, which is easily established upon differentiation. ||

Theorem  1.4: 

Let  R  be  any  rule  which  moves  the  element  requested  strictly  closer 
to  the  front  (unless  it  is  already  in  the  initial  position  in  which  case 
it  remains  there)  and  leaves  the  relative  ordering  of  the  other  elements 
unchanged.  Then  under  the  policy  which  follows  rule  R  only  when  there 
have  been  k  requests  in  a  row  for  the  same  element  the  proportion  of  time 
the element with the $j$-th largest request probability is in position $j$
goes to 1 as $k$ goes to $\infty$.

Proof : 

Suppose the elements are numbered so that $P_1 > P_2 > \cdots > P_n$. By Lemma 1.2 it follows that the proportion of reorderings that result in element 1 being moved closer to the front of the line goes to 1 as k becomes large. Hence it follows that the proportion of time that element 1 is in position 1 also goes to 1 as k gets larger. The remainder of the argument is similar. ||


Thus we see that as k goes to infinity the proportion of time that the ordered list corresponds to the optimal ordering when the $P_i$'s are known goes to 1. Hence the average cost under any of the policies specified in Theorem 1.4 converges to what the average cost would be if the $P_i$'s are known, namely $\sum_{i=1}^{n} i\,P_{(i)}$ where $P_{(i)}$ is the $i$-th largest of $P_1, \ldots, P_n$. From the results of Lemma 1.2 it would also seem reasonable that this convergence would be monotone. We will verify this monotone convergence for the easiest rule to analyze, namely the one which moves the requested element to the front of the line.

Now under the front of the line rule, if the elements have probabilities $P_1, \ldots, P_n$ then the expected position (with respect to the limiting distribution) of the element requested can be expressed as

$$\text{Average Cost} = \sum_{j=1}^{n} P_j\,E[\text{position of element } j] = \sum_{j=1}^{n} P_j \Big[ 1 + \sum_{i \ne j} P\{i \text{ precedes } j\} \Big] = 1 + \sum_{j=1}^{n} \sum_{i \ne j} \frac{P_i P_j}{P_i + P_j},$$

where we have used the fact that $P\{i \text{ precedes } j\}$ is the probability that after a long time the most recent request for either $i$ or $j$ was for $i$, which is easily seen to equal $P_i/(P_i + P_j)$. The above formula was derived in [1], [3], [4] and [5].
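The formula above is straightforward to evaluate and to compare with the cost of the best fixed ordering. The sketch below is illustrative only (function names are not from the paper) and assumes all request probabilities are strictly positive.

```python
from fractions import Fraction


def front_of_line_cost(P):
    """1 + sum over ordered pairs i != j of P_i P_j / (P_i + P_j)."""
    n = len(P)
    pair_sum = sum(P[i] * P[j] / (P[i] + P[j])
                   for j in range(n) for i in range(n) if i != j)
    return 1 + pair_sum


def known_order_cost(P):
    """Average cost sum_i i * P_(i) of the ordering by decreasing probability."""
    ranked = sorted(P, reverse=True)
    return sum((i + 1) * p for i, p in enumerate(ranked))


P = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
print(float(front_of_line_cost(P)), float(known_order_cost(P)))
```

When all n elements are equally likely every ordering is equivalent, and both functions return (n + 1)/2.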

Hence we want to show that

$$\sum_{j=1}^{n} \sum_{i \ne j} \frac{P_i^{(k)} P_j^{(k)}}{P_i^{(k)} + P_j^{(k)}}$$

is decreasing in $k$.

To prove this we first introduce the concepts of majorization and Schur functions. We say the vector $x = (x_1, \ldots, x_n)$ majorizes the vector $y = (y_1, \ldots, y_n)$, written as $x \ge_m y$, if

$$\sum_{i=1}^{j} x_{(i)} \ge \sum_{i=1}^{j} y_{(i)}, \qquad j = 1, \ldots, n - 1$$

and

$$\sum_{i=1}^{n} x_{(i)} = \sum_{i=1}^{n} y_{(i)}$$

where $x_{(i)}$, $y_{(i)}$ are the $i$-th largest values of $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ respectively.
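The definition translates directly into a small test routine. This is a sketch only: the name `majorizes` and the floating-point tolerance `tol` are illustrative choices, not part of the paper.

```python
def majorizes(x, y, tol=1e-12):
    """Return True if x majorizes y in the sense defined above."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    px = py = 0
    # Partial sums of the j largest entries, j = 1, ..., n - 1.
    for a, b in zip(xs[:-1], ys[:-1]):
        px += a
        py += b
        if px < py - tol:
            return False
    # The total sums must agree.
    return abs(sum(xs) - sum(ys)) <= tol


print(majorizes([0.7, 0.2, 0.1], [0.4, 0.3, 0.3]))
```

Intuitively, the more "spread out" vector majorizes the more "even" one, so a vector concentrated on one coordinate majorizes every other vector with the same total.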

The symmetric function $f$ is said to be a Schur concave function if $f(x) \le f(y)$ whenever $x \ge_m y$. The following criterion for determining if a function is Schur concave is due to Ostrowski.


Theorem: (Ostrowski)

A differentiable symmetric function $f$ is Schur concave if and only if

$$(x_1 - x_2)\left(\frac{\partial f(x)}{\partial x_1} - \frac{\partial f(x)}{\partial x_2}\right) \le 0 \quad \text{for all } x.$$

Proof:

See [2], p. 47.


We are now ready for

Proposition 1.5:

The function

$$H(x) = \sum_{j=1}^{n} \sum_{i \ne j} \frac{x_i x_j}{x_i + x_j}, \qquad x_i > 0,$$

is a Schur concave function.


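Proof:

$$\frac{\partial H(x)}{\partial x_1} = 2 \sum_{j \ne 1} \frac{x_j^2}{(x_1 + x_j)^2}.$$

Therefore

$$(x_1 - x_2)\left(\frac{\partial H(x)}{\partial x_1} - \frac{\partial H(x)}{\partial x_2}\right) = 2(x_1 - x_2)\left[\frac{x_2^2 - x_1^2}{(x_1 + x_2)^2} + \sum_{j \ge 3} x_j^2 \left(\frac{1}{(x_1 + x_j)^2} - \frac{1}{(x_2 + x_j)^2}\right)\right].$$

As we see that this is nonpositive, the result follows from the Ostrowski theorem. ||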
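A numerical spot check of Proposition 1.5 is sketched below. It evaluates $H$ at two vectors where the first majorizes the second, and estimates the Ostrowski expression by central finite differences; the step size `h` and the function names are illustrative assumptions, and a finite-difference check is of course no substitute for the proof.

```python
def H(x):
    """Sum over ordered pairs i != j of x_i x_j / (x_i + x_j)."""
    n = len(x)
    return sum(x[i] * x[j] / (x[i] + x[j])
               for j in range(n) for i in range(n) if i != j)


def ostrowski_term(x, h=1e-6):
    """Estimate (x_1 - x_2) * (dH/dx_1 - dH/dx_2) by central differences."""
    def partial(idx):
        up, down = list(x), list(x)
        up[idx] += h
        down[idx] -= h
        return (H(up) - H(down)) / (2 * h)
    return (x[0] - x[1]) * (partial(0) - partial(1))


x = [0.7, 0.2, 0.1]   # majorizes y
y = [0.4, 0.3, 0.3]
print(H(x), H(y), ostrowski_term(x))
```

Here $H(x) \le H(y)$, as Schur concavity requires, and the Ostrowski expression is negative at the sampled points.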
Theorem 1.6:

If the front of the line rule is only utilized when the same element has been requested k times in a row then the average cost of this policy, namely

$$1 + \sum_{j=1}^{n} \sum_{i \ne j} \frac{P_i^{(k)} P_j^{(k)}}{P_i^{(k)} + P_j^{(k)}},$$

is a decreasing function of $k$.

Proof:

This result will follow from the previous proposition upon showing that

$$\left(P_1^{(k+1)}, \ldots, P_n^{(k+1)}\right) \ge_m \left(P_1^{(k)}, \ldots, P_n^{(k)}\right).$$

Assume without loss of generality that the elements are numbered so that $P_1 \ge P_2 \ge \cdots \ge P_n$, which will imply the same ordering for the vector $P^{(k)}$. Let

$$\lambda_i^{(k)} = \frac{P_i^{(k)}}{\sum_{j=i}^{n} P_j^{(k)}}, \qquad i = 1, \ldots, n,$$

and note that from Lemma 1.2, we have that

$$\lambda_i^{(k+1)} \ge \lambda_i^{(k)}, \qquad i = 1, \ldots, n.$$

As it is easy to establish (by induction on $j$) that

$$\prod_{i=1}^{j} \left(1 - \lambda_i^{(k)}\right) = \sum_{i=j+1}^{n} P_i^{(k)}, \qquad j = 1, \ldots, n,$$

the tail sums $\sum_{i=j+1}^{n} P_i^{(k)}$ are nonincreasing in $k$, so the partial sums of the largest components are nondecreasing in $k$, and the result follows. ||
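The monotonicity asserted in Theorem 1.6 can be observed numerically by evaluating the double sum at $P^{(k)}$ for successive k. The sketch below uses an arbitrary example vector and illustrative function names; it checks one instance and is not a proof.

```python
from fractions import Fraction


def embedded_probs(P, k):
    """Modified request probabilities P^(k) for the k-in-a-row policy."""
    w = [p**k * (1 - p) / (1 - p**k) for p in P]
    total = sum(w)
    return [x / total for x in w]


def pair_sum(x):
    """Sum over ordered pairs i != j of x_i x_j / (x_i + x_j)."""
    n = len(x)
    return sum(x[i] * x[j] / (x[i] + x[j])
               for j in range(n) for i in range(n) if i != j)


P = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
values = [pair_sum(embedded_probs(P, k)) for k in range(1, 7)]
print([round(float(v), 4) for v in values])
```

For this vector the printed values decrease with k, in line with the theorem.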


2.  RANDOMIZATION 


The reason that the k-in-a-row policies do better than the no-memory reordering rules is that these latter rules make changes too frequently.

For instance if we were allowed perfect memory then the best rule would be to order the elements in decreasing order of the number of requests for them. Hence, after a while, reorderings become infrequent. For instance if n = 4 and the total number of requests for elements 1 through 4 are at present 20, 60, 10, 80 then the optimal ordering would be $e_4, e_2, e_1, e_3$ and would remain so for at least the next 10 periods regardless of the elements requested in this time span.

Another approach to slowing down changes in list order is to allow for randomized policies. In particular consider any no-memory rule R and consider the policy which, when the element requested is in position $i$, follows the dictates of rule R with probability $a_i$ and leaves the present ordering unchanged with probability $1 - a_i$, for given $a_i$, $0 \le a_i \le 1$, $i = 1, \ldots, n$. We first note that if the randomization value $a_i$ is the same for all $i$, say $a_i = a$, then the average cost for the randomized policy is the same as that of the original rule R.

Proposition  2.1: 

If $a_i = a$, $i = 1, \ldots, n$, then the average cost of the randomized
policy is independent of $a$.

Proof : 

We can analyze the sequence of orderings as a semi-Markov process where a transition occurs whenever the outcome of the randomization results in rule R being followed. As this occurs with probability $a$ independent of the particular order, it follows that the mean time spent in each state during a visit is $1/a$ for every state. Hence the limiting probabilities are exactly the same as those of the embedded Markov chain which considers the orderings only at times where R is followed. As these limiting probabilities are clearly the same as when $a = 1$, the result follows. ||

In general the average cost of a randomized policy based on the rule R will depend on the values of the $a_i$. However an interesting exception is when R is taken to be the (conjectured optimal) transposition rule. For this case we first note the following lemma, which was also proved in [5] for the special case $a_i = 1$.

Lemma  2.2: 

For the randomized policy based on the transposition rule and using randomization constants $a_i$, $i = 1, \ldots, n$,

$$P_{i_{j+1}} \Pr(i_1, \ldots, i_j, i_{j+1}, \ldots, i_n) = P_{i_j} \Pr(i_1, \ldots, i_{j+1}, i_j, \ldots, i_n) \tag{2.1}$$

where $i_1, \ldots, i_n$ is any permutation of $1, 2, \ldots, n$ and $\Pr(i_1, \ldots, i_n)$ is the stationary probability that the list order is $(i_1, \ldots, i_n)$ given that the stated policy is employed.


Proof : 

By multiplying both sides of Equation (2.1) by $a_{j+1}$ we see that (2.1) is equivalent to stating that the rate at which the Markov chain goes from any state $s$ to $s'$ is equal to the rate at which it goes from $s'$ to $s$; or in other words, it states that the Markov chain is time reversible. Now it is well known that a necessary and sufficient condition for time reversibility is that for any sequence of states $s, s^1, s^2, \ldots, s^r, s$ the transition probabilities must satisfy

$$P_{s,s^1} P_{s^1,s^2} \cdots P_{s^r,s} = P_{s,s^r} P_{s^r,s^{r-1}} \cdots P_{s^1,s};$$

that this is the case is easily verified for this particular model. (For instance if $n = 3$ and the sequence of states is (1,2,3), (2,1,3), (2,3,1), (3,2,1), (3,1,2), (1,3,2), (1,2,3), the product of the transition probabilities going from left to right is $a_2P_2\,a_3P_3\,a_2P_3\,a_3P_1\,a_2P_1\,a_3P_2 = a_2^3 a_3^3 P_1^2 P_2^2 P_3^2$, whereas in the reverse direction it is $a_3P_3\,a_2P_3\,a_3P_2\,a_2P_2\,a_3P_1\,a_2P_1 = a_2^3 a_3^3 P_1^2 P_2^2 P_3^2$.)

Since the stationary probabilities are obtained from the set of equations (2.1), which do not depend on the $a_i$, we have

Theorem 2.3:

The average cost of any randomized policy based on the transposition rule is independent of the randomization constants $a_i$.
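Theorem 2.3 can be checked exactly on a small instance. Iterating (2.1) suggests the product form $\Pr(i_1, \ldots, i_n) \propto \prod_{j} P_{i_j}^{\,n-j}$ for the stationary distribution; this closed form is not stated in the text above and is treated here as an assumption to be verified. The sketch builds the transition matrix of the randomized transposition policy for n = 3 and confirms that the same distribution is stationary for two different choices of randomization constants. All names are illustrative, and positions in the code are 0-based, so `a[m]` is the constant used when the requested element sits at index m (with `a[0]` unused).

```python
from fractions import Fraction
from itertools import permutations


def transition_matrix(P, a):
    """Transition probabilities of the randomized transposition policy."""
    states = list(permutations(sorted(P)))
    T = {s: {t: Fraction(0) for t in states} for s in states}
    for s in states:
        for m, e in enumerate(s):
            if m == 0:                      # front element: nothing to do
                T[s][s] += P[e]
                continue
            t = list(s)
            t[m - 1], t[m] = t[m], t[m - 1]
            T[s][tuple(t)] += P[e] * a[m]   # transposition carried out
            T[s][s] += P[e] * (1 - a[m])    # ordering left unchanged
    return states, T


def product_form(P, states):
    """Candidate stationary law: weight prod_j P[i_j]^(n-j), normalized."""
    n = len(P)
    w = {}
    for s in states:
        weight = Fraction(1)
        for m, e in enumerate(s):
            weight *= P[e] ** (n - 1 - m)
        w[s] = weight
    total = sum(w.values())
    return {s: w[s] / total for s in states}


def is_stationary(pi, states, T):
    """Check pi T = pi exactly."""
    return all(sum(pi[s] * T[s][t] for s in states) == pi[t] for t in states)


P = {1: Fraction(1, 2), 2: Fraction(3, 10), 3: Fraction(1, 5)}
for a in ([0, 1, 1], [0, Fraction(1, 3), Fraction(3, 4)]):
    states, T = transition_matrix(P, a)
    print(is_stationary(product_form(P, states), states, T))
```

Since the stationary distribution is the same in both cases, so is the average cost.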
3. TRANSPOSITION RULE OPTIMALITY WHEN $P_1 = p$, $P_i = (1 - p)/(n - 1)$, $i = 2, \ldots, n$

In this section we shall suppose that $P_1 = p$, $P_2 = \cdots = P_n = (1 - p)/(n - 1) = q$.


In  general  under  any  rule,  the  average  cost  can  be  obtained  by  analyzing  the 
Markov  chain  of  n!  states  where  the  state  at  any  time  corresponds  to  the 
ordering  at  that  time.  However  for  the  P  of  the  form  above,  as  all  of 
the  elements  2  through  n  are  identical  (as  they  have  the  same  probability 
of  being  requested)  we  can  obtain  the  average  cost  by  analyzing  the  much 
simpler  Markov  chain  of  n  states  with  the  state  being  the  position  of 
element  1. 

Consider the following restricted class of rules which, when an element is requested and found in position $i$, move the element to position $j_i$ and leave the relative positions of the other elements unchanged. In addition we suppose that $j_i < i$ for $i > 1$, $j_1 = 1$ and $j_i \ge j_{i-1}$, $i = 2, \ldots, n$. The set $\{j_i,\ i = 1, \ldots, n\}$ characterizes a rule in this class.

For a given rule in the above class let

$$k(i) = \max\{k : j_{i+k} \le i\}.$$

In other words, for any $i$, an element in any of the positions $i, i + 1, \ldots, i + k(i)$ will, if requested, be moved to a position less than or equal to $i$.

For a specified rule in the above class let us denote the stationary probabilities when this rule is employed by

$$\pi_i = \Pr\{e_1 \text{ is in position } i\}, \qquad i = 1, \ldots, n$$

$$S_i = \sum_{j=i+1}^{n} \pi_j = \Pr\{e_1 \text{ is in a position} > i\}, \qquad i = 0, 1, \ldots, n - 1.$$


Before  writing  down  the  steady  state  equations  it  may  be  worth  noting 

the  following: 

(i)  Any  element  moves  toward  the  back  of  the  list  at  most  one 
position  at  a  time. 

(ii)  If  an  element  is  in  position  i  and  neither  it  nor  any  of 

the  elements  in  the  following  k(i)  positions  are  requested 
it  will  remain  in  position  i  . 

(iii)  Any element in one of the positions $i, i + 1, \ldots, i + k(i)$
will be moved to a position $\le i$ if requested.

The steady state probabilities can now easily be seen to satisfy

$$S_i = S_{i+k(i)} + (S_i - S_{i+k(i)})(1 - p) + (S_{i-1} - S_i)\,q\,k(i)$$

or

$$S_i = a_i S_{i-1} + (1 - a_i) S_{i+k(i)}, \quad i = 1, \ldots, n - 1, \qquad S_0 = 1,\ S_n = 0 \tag{3.1}$$

where

$$a_i = \frac{q\,k(i)}{q\,k(i) + p}. \tag{3.2}$$
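Equations (3.1) and (3.2) form a small linear system that can be solved exactly. The sketch below does so with plain Gauss-Jordan elimination over rationals; the rule is passed as a dictionary `j` mapping each position i to its target position $j_i$, and all function names are illustrative. It then compares the front of the line rule ($j_i = 1$) with the transposition rule ($j_i = i - 1$) for n = 4 and p = 1/2.

```python
from fractions import Fraction


def k_of(j, i):
    """k(i) = max{k : j_{i+k} <= i}, for 1 <= i <= n - 1."""
    n = len(j)
    return max(k for k in range(1, n - i + 1) if j[i + k] <= i)


def tail_probabilities(j, p):
    """Solve (3.1): returns [S_0, S_1, ..., S_n] for the rule j."""
    n = len(j)
    q = (1 - p) / (n - 1)
    size = n - 1                        # unknowns S_1, ..., S_{n-1}
    A = [[Fraction(0)] * (size + 1) for _ in range(size)]
    for i in range(1, n):
        k = k_of(j, i)
        a = q * k / (q * k + p)         # coefficient (3.2)
        row = A[i - 1]
        row[i - 1] += 1
        if i == 1:
            row[size] += a              # S_0 = 1 goes to the right-hand side
        else:
            row[i - 2] -= a
        if i + k < n:
            row[i + k - 1] -= 1 - a     # S_n = 0 contributes nothing
    for c in range(size):               # Gauss-Jordan elimination
        piv = next(r for r in range(c, size) if A[r][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(size):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [Fraction(1)] + [A[r][size] for r in range(size)] + [Fraction(0)]


p = Fraction(1, 2)
front = tail_probabilities({1: 1, 2: 1, 3: 1, 4: 1}, p)
transp = tail_probabilities({1: 1, 2: 1, 3: 2, 4: 3}, p)
print(front, transp)
```

For the front of the line rule the sketch gives $S_1 = 1/2$, i.e. element 1 is at the front with probability p, as one would expect since the front element is always the one most recently requested.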
Now consider a special rule of the above class, namely the transposition rule which has $j_i = i - 1$, $i = 2, \ldots, n$, $j_1 = 1$. Let the corresponding $S_i$ be denoted by $S'_i$ for the transposition rule. Then from Equation (3.1) we have, since $k(i) = 1$, that

$$S'_i = \frac{q S'_{i-1} + p S'_{i+1}}{p + q}$$

or, equivalently,

$$S'_{i+1} - S'_i = \frac{q}{p}\,(S'_i - S'_{i-1})$$

which, iterating, implies

$$S'_{i+r} - S'_{i+r-1} = \left(\frac{q}{p}\right)^{r} (S'_i - S'_{i-1}).$$

Summing the above equations from $r = 1, \ldots, k$ we obtain

$$S'_{i+k} - S'_i = \left[\frac{q}{p} + \cdots + \left(\frac{q}{p}\right)^{k}\right] (S'_i - S'_{i-1}).$$

Now consider any other rule R of the considered class and let $k(i)$ be as defined for that rule. Now from the above we see that for the transposition rule

$$S'_{i+k(i)} - S'_i = \left[\frac{q}{p} + \cdots + \left(\frac{q}{p}\right)^{k(i)}\right] (S'_i - S'_{i-1})$$

or, equivalently,

$$S'_i = b_i S'_{i-1} + (1 - b_i) S'_{i+k(i)} \tag{3.3}$$

where

$$b_i = \frac{(q/p) + \cdots + (q/p)^{k(i)}}{1 + (q/p) + \cdots + (q/p)^{k(i)}}, \qquad i = 1, \ldots, n - 1, \tag{3.4}$$

and $S'_n = 0$, $S'_0 = 1$.

n  0 

We are now ready to prove

Theorem 3.1:

If $p \ge 1/n$, then $S'_i \le S_i$ for all $i$.

If $p \le 1/n$, then $S'_i \ge S_i$ for all $i$.

Proof:

Consider the case $p \ge 1/n$, which is equivalent to $p \ge q$. Note that in this case

$$a_i = 1 - \frac{1}{1 + k(i)\,q/p} \ \ge\ 1 - \frac{1}{1 + q/p + \cdots + (q/p)^{k(i)}} = b_i.$$


Now define a Markov chain with states $0, 1, \ldots, n$ and transition probabilities

$$P_{0,0} = P_{n,n} = 1, \qquad P_{i,j} = \begin{cases} c_i & \text{if } j = i - 1 \\ 1 - c_i & \text{if } j = i + k(i) \end{cases} \qquad i = 1, \ldots, n - 1. \tag{3.5}$$

Let $f_i$ denote the probability that this Markov chain ever enters state 0 given that it starts in state $i$. Then $f_i$ satisfies

$$f_i = c_i f_{i-1} + (1 - c_i) f_{i+k(i)}, \qquad i = 1, \ldots, n - 1,$$

$$f_0 = 1, \qquad f_n = 0.$$
Hence, as it is well known that the above set of equations has a unique solution, it follows from (3.1) that if we take $c_i$ equal to $a_i$ for all $i$, then $f_i$ will equal the $S_i$ of rule R, and if we let $c_i = b_i$, then $f_i$ equals $S'_i$. Let $X_r(a)$ and $X_r(b)$ denote the state at time $r$ of the Markov chain defined by (3.5) when $c_i$ equals respectively $a_i$ and $b_i$. Now, as $P\{X_1(a) \ge j \mid X_0(a) = i\}$ and $P\{X_1(b) \ge j \mid X_0(b) = i\}$ are both increasing in $i$, for all $j$, and as

$$P\{X_1(a) \ge j \mid X_0(a) = i\} \le P\{X_1(b) \ge j \mid X_0(b) = i\}$$

for all $j$, it can be shown (see Theorem 4 of [6]) that

$$P\{X_r(a) \ge j \mid X_0(a) = i\} \le P\{X_r(b) \ge j \mid X_0(b) = i\}.$$

Hence,

$$P\{X_r(a) = 0\} \ge P\{X_r(b) = 0\} \quad \text{for all } r,$$

implying that

$$S_i \ge S'_i.$$

When $p \le 1/n$, then $a_i \le b_i$, and the above inequality is reversed. ||
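The comparison between the coefficients $a_i$ of (3.2) and $b_i$ of (3.4), on which the proof rests, is easy to check numerically. The sketch below (illustrative names, with n = 5 chosen arbitrarily) evaluates both for several values of p and k.

```python
from fractions import Fraction


def a_coeff(p, q, k):
    """Coefficient (3.2): q k / (q k + p)."""
    return q * k / (q * k + p)


def b_coeff(p, q, k):
    """Coefficient (3.4), built from the geometric sum of (q/p)^m, m = 1..k."""
    r = q / p
    geo = sum(r**m for m in range(1, k + 1))
    return geo / (1 + geo)


n = 5
for p in (Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)):
    q = (1 - p) / (n - 1)
    print(p, [(a_coeff(p, q, k) >= b_coeff(p, q, k)) for k in range(1, n)])
```

At $p = 1/n$ the two coefficients coincide, since then $q = p$ and both reduce to $k/(k + 1)$.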

Thus for all rules in the considered class, the asymptotic position of element 1 will be stochastically larger (smaller) than it would be under the transposition rule when $p \ge 1/n$ ($p \le 1/n$).

Now consider any cost function whose cost of requesting an element in position $i$, call it $g(i)$, is an increasing function of $i$.

Letting $X$ denote the (asymptotic) position of element 1, we have that the average expected cost can be expressed as

$$E[\text{cost}] = p\,E[g(X)] + (1 - p)\,\frac{g(1) + \cdots + g(n) - E[g(X)]}{n - 1}.$$

The above follows by conditioning on whether or not element 1 is requested and then noting that if element 1 is not requested, then any of the remaining $n - 1$ elements are equally likely to be. Hence,

$$E[\text{cost}] = \left(p - \frac{1 - p}{n - 1}\right) E[g(X)] + (1 - p)\,\frac{g(1) + \cdots + g(n)}{n - 1},$$

and thus if $p \ge 1/n$, the expected cost is minimized by minimizing $E[g(X)]$ and if $p \le 1/n$ by maximizing $E[g(X)]$. But as stochastically minimizing (maximizing) $X$ is equivalent to minimizing (maximizing) $E[g(X)]$ for every increasing function $g$, it follows that the expected cost is minimized by the transposition rule.